10 research outputs found

    Dynamic Parameter Allocation in Parameter Servers

    To keep up with increasing dataset sizes and model complexity, distributed training has become a necessity for large machine learning tasks. Parameter servers ease the implementation of distributed parameter management (a key concern in distributed training), but can induce severe communication overhead. To reduce communication overhead, distributed machine learning algorithms use techniques to increase parameter access locality (PAL), achieving up to linear speed-ups. We found that existing parameter servers provide only limited support for PAL techniques, however, and therefore prevent efficient training. In this paper, we explore whether and to what extent PAL techniques can be supported, and whether such support is beneficial. We propose to integrate dynamic parameter allocation into parameter servers, describe an efficient implementation of such a parameter server called Lapse, and experimentally compare its performance to existing parameter servers across a number of machine learning tasks. We found that Lapse provides near-linear scaling and can be orders of magnitude faster than existing parameter servers.

    Streaming Data through the IoT via Actor-Based Semantic Routing Trees

    The Internet of Things (IoT) enables the usage of resources at the edge of the network for various data management tasks that are traditionally executed in the cloud. However, the heterogeneity of devices and communication methods in a multi-tiered IoT environment (cloud/fog/edge) exacerbates the problem of deciding which nodes to use for processing and how to route data. In addition, neither decision can be made only statically for the entire lifetime of an application, as an IoT environment is highly dynamic and nodes in the same topology can be both stationary and mobile as well as reliable and volatile. As a result of these different characteristics, an IoT data management system that spans all tiers of an IoT network cannot meet the same availability assumptions for all its nodes. To address the problem of choosing ad hoc which nodes to use and include in a processing workload, we propose a networking component that uses a priori as well as ad hoc routing information from the network. Our approach, called Rime, relies on keeping track of nodes at the gateway level and exchanging routing information with other nodes in the network. By tracking nodes while the topology evolves in a geo-distributed manner, we enable efficient communication even in the case of frequent node failures. Our evaluation shows that Rime keeps communication costs and message transmissions in check by reducing unnecessary message exchange by up to 82.65%.

    Monitoring of Stream Processing Engines Beyond the Cloud: An Overview

    The Internet of Things (IoT) is rapidly growing into a network of billions of interconnected physical devices that constantly stream data. To enable data-driven IoT applications, data management systems like NebulaStream have emerged that manage and process data streams, potentially in combination with data at rest, in a heterogeneous distributed environment of cloud and edge devices. To perform internal optimizations, an IoT data management system requires a monitoring component that collects system metrics of the underlying infrastructure and application metrics of the running processing tasks. In this paper, we explore the applicability of existing cloud-based monitoring solutions for stream processing engines in an IoT environment. To this end, we provide an overview of commonly used approaches, discuss their design, and outline their suitability for the IoT. Furthermore, we experimentally evaluate different monitoring scenarios in an IoT environment and highlight bottlenecks and inefficiencies of existing approaches. Based on our study, we show the need for novel monitoring solutions for the IoT and define a set of requirements

    IoT-PMA: Patient Health Monitoring in Medical IoT Ecosystems

    The emergence of the Internet of Things (IoT) and the increasing number of cheap medical devices enable geographically distributed healthcare ecosystems of various stakeholders. Such ecosystems contain different application scenarios, e.g., (mobile) patient monitoring using various vital parameters such as heart rate signals. The increasing number of data producers and the transfer of data between medical stakeholders introduce several challenges to the data processing environment, e.g., heterogeneity and distribution of computing and data, low-latency processing, as well as data security and privacy. Current approaches propose cloud-based solutions, which introduce latency bottlenecks and high risks for companies dealing with sensitive patient data. In this paper, we address the challenges of medical IoT applications by proposing an end-to-end patient monitoring application that includes NebulaStream as the data processing system, an easy-to-use UI that provides ad hoc views on the available vital parameters, and the integration of ML models to enable predictions on the patients' health state. Using our end-to-end solution, we implement a real-world patient monitoring scenario for hemodynamic and pulmonary decompensations, which are dynamic and life-threatening deteriorations of lung and cardiovascular functions. Our application provides ad hoc views of the vital parameters and derived decompensation severity scores with continuous updates on the latest data readings to support timely decision-making by physicians. Furthermore, we envision the infrastructure of an IoT ecosystem for a multi-hospital scenario that enables geo-distributed medical participants to contribute data to the application in a secure, private, and timely manner.

    NebulaStream: Complex Analytics Beyond the Cloud

    The arising Internet of Things (IoT) will require significant changes to current stream processing engines (SPEs) to enable large-scale IoT applications. In this paper, we present challenges and opportunities for an IoT data management system to enable complex analytics beyond the cloud. As one of the most important upcoming IoT applications, we focus on the vision of a smart city. The goal of this paper is to bridge the gap between the requirements of upcoming IoT applications and the supported features of an IoT data management system. To this end, we outline how state-of-the-art SPEs have to change to exploit the new capabilities of the IoT and showcase how we tackle IoT challenges in our own system, NebulaStream. This paper lays the foundation for a new type of systems that leverages the IoT to enable large-scale applications over millions of IoT devices in highly dynamic and geo-distributed environments

    Query Execution on Modern CPUs

    Over the last decades, databases have evolved from disk-based to main-memory-based database systems. To address these challenges and unlock the full potential of modern processors, this dissertation presents four approaches to reduce the impact of the "memory wall". The first approach shows how special processor instructions (so-called SIMD instructions) increase cache utilization while reducing the number of instructions. To this end, this work adapts existing tree structures so that these SIMD instructions can be used, thereby reducing the required main-memory bandwidth. The second approach introduces a model that makes it possible to unify query execution across different database systems and thus make it comparable. This unification makes it possible to optimize hardware utilization by adding knowledge about the hardware on which the query is executed. The third approach analyzes different database operators with regard to their behavior in different hardware environments. This analysis makes it possible to better understand database operators and to develop cost models for their behavior. The fourth approach builds on the operator analysis and introduces a progressive optimization algorithm that adapts query execution at run time to the prevailing conditions, such as data or hardware characteristics. For this purpose, processor-internal counters that reflect the operator's behavior on the respective hardware are used at run time.

    Los efectos del cumplimiento de la condición. Las raíces romanistas y el régimen en las codificaciones contemporáneas

    The essay focuses, from a synchronic and diachronic comparative perspective, on the effects of the fulfillment of a condition, with particular reference to European and Latin American legal systems. Some legal systems adopt, in this matter, the rule of retroactivity (ex tunc effect), while others adopt the opposite rule of non-retroactivity (ex nunc effect). This duality of approaches has its most remote antecedents in Roman law. Indeed, a reading of the sources suggests that for the Roman jurists, guided by their customary pragmatism aimed at adapting the proposed solutions to the different situations that arise or might arise, the effects of a conditional transaction sometimes take place ex nunc and at other times ex tunc. The latter solution, reinforced by the authority of Bartolus, was adopted by the Code Napoléon, while the former, filtered through Pandectist scholarship, was adopted by the BGB. Hence the duality of solutions, both attributable to the experience of Roman legal science.

    The Plant Cell: Aspects of Its Form and Function
